match_making

“Oh, but nonsense, she thought; William must marry Lily. They have so many things in common. Lily is so fond of flowers. They are both cold and aloof and rather self-sufficing. She must arrange for them to take a long walk together.” (Virginia Woolf)

Neuron Match Making

Insect brains seem pretty stereotyped. But just how stereotyped are they? It comes as a surprise to many neuroscientists who work only on vertebrates to learn that, in insects, individual neurons can readily and reliably be re-found and identified across different members of a species, and perhaps even across species.

match_examples

As of 2020, two large data sets are available for the vinegar fly, D. melanogaster: the hemibrain and FAFB. Together they make it possible to look at the full morphology of ~25,000 neurons in two data sets. However, neurons in FAFB have been manually or semi-manually reconstructed, which makes the automatic assignment of FAFB-hemibrain neuron matches non-trivial. In this R package we have built tools that let users record and deploy inter-dataset matches.

What use is this information? Matches can be used to look at morphological stereotypy, help find genetic lines that label neurons, help transfer information associated with one reconstruction to the same cell in a different brain, compare neuron connectivity between two brains, and so on.

For example, by matching neurons up between the hemibrain and FAFB, we see that the numbers of cell types within one ‘hemilineage’ (a set of neurons that are born and develop together) are comparable between these two different flies:

hemilineage_example
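
A comparison of this kind can be tallied in a few lines of base R. The sketch below is only an illustration on made-up data: the column names (dataset, hemilineage, cell.type) and all values are assumptions, not the layout of the package's own match table.

```r
# Toy data only: hemilineage names, cell types and column names are made up.
matched <- data.frame(
  dataset     = c("hemibrain", "hemibrain", "hemibrain", "FAFB", "FAFB", "FAFB"),
  hemilineage = c("hl1", "hl1", "hl2", "hl1", "hl1", "hl2"),
  cell.type   = c("typeA", "typeB", "typeC", "typeA", "typeB", "typeC"),
  stringsAsFactors = FALSE
)

# Keep one row per unique dataset / hemilineage / cell type combination
unique_types <- unique(matched[, c("dataset", "hemilineage", "cell.type")])

# Cross-tabulate: rows are hemilineages, columns are data sets
type_counts <- table(unique_types$hemilineage, unique_types$dataset)
print(type_counts)
```

Each cell of type_counts is the number of distinct cell types seen for that hemilineage in that data set, so similar values across a row indicate comparable counts.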

What You Need

In order to use these tools you will need RStudio and an installation of the natverse. To use them to maximum effect, you will also need permission to access the FAFB CATMAID v14 project, although some neurons can be read by the public from Virtual Fly Brain’s CATMAID project for FAFB, and these should have the same unique skeleton ID numbers. Our pipeline function makes use of the Google Filestream application, which should be installed on your machine. Note also that neurons are read from the FAFB CATMAID project, so you must have login details for this project recorded in your .Renviron (edit with: usethis::edit_r_environ) for these functions to work. For help, see here.
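
Before starting, it can be handy to check that the entries you added to .Renviron are visible to your R session. The helper below, missing_env_vars, is a hypothetical convenience function, and the variable names are placeholders for illustration only; substitute the names given in the CATMAID login help linked above.

```r
# Report which of the named environment variables are unset or empty.
# (Hypothetical helper, not part of this package.)
missing_env_vars <- function(vars) {
  values <- Sys.getenv(vars, unset = "")
  vars[values == ""]
}

# Placeholder names for illustration only; use the real names from the login help
needed <- c("EXAMPLE_CATMAID_SERVER", "EXAMPLE_CATMAID_TOKEN")

missing <- missing_env_vars(needed)
if (length(missing) > 0) {
  message("Not set in .Renviron: ", paste(missing, collapse = ", "))
}
```

Remember that R only reads .Renviron at start-up, so restart your R session after editing the file.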

Authorisation

In order to write neuron matches to the project you must have access to the hemibrain Google Drive or the match making Google sheet (see below) owned by the Drosophila Connectomics Group. If you do not have access but would like to help or use this information, get in contact! You do not need programming skills to help us match make neurons, as we have written an interactive pipeline in R which does most of the work for you (see below).

We also regularly update a data frame saved in this package, as a snapshot of the matches that have been made. Without authorisation you can access these matches, but they may not be the most up to date.

The Google Sheet

We in the Drosophila Connectomics Group have been recording our match making in a Google sheet named em_matching. This sheet has two tabs of concern here, hemibrain for hemibrain neuron -> FAFB neuron matches and fafb for FAFB neuron to hemibrain neuron matches.

google_sheet

If you have authorisation, you can see the most up-to-date matches with the function hemibrain_matches.

Besides the matches themselves, the data frame it returns contains other meta information. The function hemibrain_matches has an argument called priority. This specifies whether to use FAFB->hemibrain matches (FAFB) or hemibrain->FAFB matches (hemibrain) in order to ascribe cell type names to FAFB neurons. In both cases, cell type names are attached to hemibrain bodyids, and propagated to their FAFB matches.
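
To make the idea of priority concrete, here is a toy sketch in base R. It is not the package's implementation: the function assign_fafb_types, the column names and all IDs are made up for illustration. It only shows how the two match directions can disagree about one FAFB neuron, and how a priority setting decides which direction wins.

```r
# Toy data only: the IDs, column names and cell types below are made up.
hemibrain_tab <- data.frame(
  bodyid    = c("1001", "1002"),
  cell.type = c("typeA", "typeB"),
  fafb_skid = c("501", "502"),            # hemibrain -> FAFB matches
  stringsAsFactors = FALSE
)
fafb_tab <- data.frame(
  skid             = c("501", "503"),
  hemibrain_bodyid = c("1002", "1001"),   # FAFB -> hemibrain matches
  stringsAsFactors = FALSE
)

# Hypothetical helper: give each FAFB skid a cell type name.
assign_fafb_types <- function(hemibrain_tab, fafb_tab,
                              priority = c("FAFB", "hemibrain")) {
  priority <- match.arg(priority)
  # Cell type names always live on hemibrain bodyids
  type_of <- setNames(hemibrain_tab$cell.type, hemibrain_tab$bodyid)
  from_fafb <- data.frame(
    skid = fafb_tab$skid,
    cell.type = unname(type_of[fafb_tab$hemibrain_bodyid]),
    stringsAsFactors = FALSE
  )
  from_hemi <- data.frame(
    skid = hemibrain_tab$fafb_skid,
    cell.type = hemibrain_tab$cell.type,
    stringsAsFactors = FALSE
  )
  # Put the preferred direction first, then keep the first entry per skid
  combined <- if (priority == "FAFB") {
    rbind(from_fafb, from_hemi)
  } else {
    rbind(from_hemi, from_fafb)
  }
  combined <- combined[!is.na(combined$skid) & !is.na(combined$cell.type), ]
  combined[!duplicated(combined$skid), ]
}

print(assign_fafb_types(hemibrain_tab, fafb_tab, priority = "FAFB"))
print(assign_fafb_types(hemibrain_tab, fafb_tab, priority = "hemibrain"))
```

In this toy example the FAFB neuron 501 is matched to bodyid 1002 in one direction and to bodyid 1001 in the other, so its cell type name depends on which direction takes priority.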

Match Quality

Once a match is recorded, the user selects a quality for that match. There can be no match (none), a tract-only match (tract), a poor match (poor), an okay match (medium) or an exact match (good). As a rule of thumb, a poor match could be a neuron from a very similar cell type, or a heavily under-traced neuron that may be the correct cell type. An okay match should be a neuron that looks to be from the same morphological cell type, although there may be some discrepancies in its arbour. A good match is a neuron that corresponds well between FAFB and the hemibrain data. A tract-only match just means that the matched neuron should share the same cell body fiber, and therefore the same developmental ontogeny, even if the rest of its morphology is quite different.

It is very important to note that two neurons cannot be a match if they do not seem to share the same cell body fiber tract. Being in a different tract is a deal breaker.
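
If you handle match qualities in your own R code, an ordered factor is one convenient representation. The sketch below assumes the five labels rank from none (lowest) to good (highest), following the order in which they are listed above; as_match_quality is a hypothetical helper, not a function from this package.

```r
# The five quality labels, ordered from weakest to strongest (assumed ranking)
quality_levels <- c("none", "tract", "poor", "medium", "good")

# Hypothetical helper: convert labels to an ordered factor, rejecting typos
as_match_quality <- function(x) {
  bad <- setdiff(unique(x), quality_levels)
  if (length(bad) > 0) {
    stop("Unknown match quality: ", paste(bad, collapse = ", "))
  }
  factor(x, levels = quality_levels, ordered = TRUE)
}

q <- as_match_quality(c("good", "poor", "medium", "tract"))

# Ordered factors support comparisons, e.g. keep medium or better
at_least_medium <- q >= "medium"
print(table(q))
```

Comparisons such as q >= "medium" then make it easy to filter a match table down to the matches you trust.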

Some good matches are striking. For example:

good_match_1

In the above case, the FAFB neuron has been quite extensively manually traced, meaning that these cells look very similar to one another.

Be aware that while neurons must share the same cell body fiber tract, these tracts can be a little offset. For example, this is also a good match:

good_match_2

If the soma is missing, it might be safer to note a match as ‘medium’.

medium_match_1

You might also use medium if you have a nice looking match and suspect that there is a medium/large discrepancy because the FAFB neuron (here shown in red) is under-traced, such as:

medium_match_2

Or:

medium_match_3

Bear in mind that the hemibrain volume only covers ~1/4 of the fly mid-brain, so neurons are truncated (here the hemibrain neuron is in black), but we can still make matches for many of them:

medium_match_4

A larger degree of under-tracing may lead you to assign a match as poor. In this case, you think the two neurons may be ‘the same isomorphic cell type’ but you could be wrong. For example:

poor_match_1

A poor match may also be made if you think there is a slight offset, possibly due to a registration issue:

poor_match_2

Though in this case, choosing an even less-traced FAFB neuron may be better:

poor_match_3

A poor match can be given even to very under-traced FAFB neurons:

poor_match_4

And even fragments if you are convinced the morphology is unique enough (but be careful!):

poor_match_5

Adding Your Own Matches

So far we have matched up a few thousand neurons. About 25 thousand matches are possible because that is the number of reconstructed neurons in the hemibrain data set. You can help us (and yourself!) by adding matches to our database. There are two main ways of doing this:

Adding Ready-Made Matches

You can add matches you have already made by your own means. For this, you will need to get a data frame into R (e.g. by reading a .csv file) that has three columns: bodyid, which contains the hemibrain neurons’ unique Body IDs; skid, which has the skeleton IDs for FAFB CATMAID neurons; and quality, which gives a qualitative assessment of match quality (see above). If in doubt, put poor.
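
A minimal sketch of preparing such a data frame is given below, using only base R. The helper check_made_matches and the example file contents are made up for illustration, and the choice to fill an empty quality with poor is an assumption based on the advice above. The sketch only checks the table; it does not write anything to the Google sheet.

```r
required_cols  <- c("bodyid", "skid", "quality")
quality_levels <- c("none", "tract", "poor", "medium", "good")

# Hypothetical helper: check a table of ready-made matches before submitting it
check_made_matches <- function(df) {
  missing_cols <- setdiff(required_cols, colnames(df))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  # IDs can be very large integers, so keep them as character strings
  df$bodyid <- as.character(df$bodyid)
  df$skid   <- as.character(df$skid)
  # Assumption: an empty quality is treated as "in doubt" and set to "poor"
  empty <- is.na(df$quality) | df$quality == ""
  df$quality[empty] <- "poor"
  # Anything else must be one of the known quality labels
  unknown <- setdiff(unique(df$quality), quality_levels)
  if (length(unknown) > 0) {
    stop("Unknown quality values: ", paste(unknown, collapse = ", "))
  }
  df[, required_cols]
}

# Write and read back a tiny made-up .csv file to show the round trip
csv_file <- tempfile(fileext = ".csv")
writeLines(c("bodyid,skid,quality",
             "1001,501,good",
             "1002,502,",
             "1003,503,medium"), csv_file)
my_matches <- check_made_matches(read.csv(csv_file, colClasses = "character"))
print(my_matches)
```

Reading every column as character avoids IDs being converted to floating point numbers and losing digits.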

Sometimes you cannot add a match because your neuron does not exist in the first column of either the hemibrain tab or the fafb tab of our Google sheet. In these cases, if you have a valid ID, you can add it to the sheet either manually or programmatically; the programmatic route ensures that all the right meta data is easily included.
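
To find out in advance which of your neurons are absent from a tab, you can compare your IDs against the first column of that tab. The IDs below are made up, and sheet_bodyids stands in for a column you have already read into R by whatever means you have access to.

```r
# Made-up IDs for illustration
sheet_bodyids <- c("1001", "1002", "1003")   # first column of a sheet tab
my_bodyids    <- c("1002", "1004", "1005")   # IDs you want to add matches for

# IDs that would need to be added to the sheet first
not_in_sheet <- setdiff(my_bodyids, sheet_bodyids)
print(not_in_sheet)
```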

The Match Making Pipeline

You can also use our interactive pipeline to match neurons between hemibrain and FAFB. There are two versions of this pipeline: one that takes hemibrain neurons from neuPrint and tries to find the best FAFB match for each of them (hemibrain_matching), and one that takes FAFB neurons from CATMAID and tries to find the best hemibrain match for each of them (fafb_matching). Once matches are made, the results become available with hemibrain_matches.

For a video tutorial, see here.

When you run these functions you will enter an interactive pipeline in an rgl window. Prompts will be given to you in your R console, and you can rotate and pan in the window to see neurons. The neuron selected for matching is shown in black (i.e. if using hemibrain_matching this will be a hemibrain neuron), and potential matches are shown in colour (i.e. if using hemibrain_matching these will be FAFB neurons). Potential matches are presented in order of NBLAST score (a measure of morphological similarity). Usually, for reasonably traced FAFB neurons, a good match appears in the top 10 hits.
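
The idea of a "top 10 hits" list can be sketched with a plain named vector of similarity scores. The scores and neuron names here are made up, and the sketch assumes that a higher score means a more similar morphology; it does not compute NBLAST scores itself.

```r
# Made-up similarity scores for one query neuron against four candidates
scores <- c(fafb_501 = 0.12, fafb_502 = 0.71, fafb_503 = -0.05, fafb_504 = 0.44)

# Sort from most to least similar and keep at most the first 10 candidates
top_hits <- head(sort(scores, decreasing = TRUE), n = 10)
print(names(top_hits))
```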

pipeline_console

Uses

One use we have already found for all of this match making is to cross-identify neuron cell body fiber tracts and (hemi)lineages. This means that we now have the locations in FAFB for different known sets of cells. You can see seed planes for them here.

google_sheet_lineages
